NSF PAR Search | NSF Public Access Repository

What Should We Engineer in Prompts? Training Humans in Requirement-Driven LLM Use

https://doi.org/10.1145/3731756

Ma, Qianou; Peng, Weirui; Yang, Chenyang; Shen, Hua; Koedinger, Kenneth; Wu, Tongshuang (April 2025, ACM Transactions on Computer-Human Interaction)

Prompting LLMs for complex tasks (e.g., building a trip advisor chatbot) requires humans to clearly articulate customized requirements (e.g., “start the response with a tl;dr”). However, existing prompt engineering instructions often lack focused training on requirement articulation and instead tend to emphasize increasingly automatable strategies (e.g., tricks like adding role-plays and “think step-by-step”). To address the gap, we introduce Requirement-Oriented Prompt Engineering (ROPE), a paradigm that focuses human attention on generating clear, complete requirements during prompting. We implement ROPE through an assessment and training suite that provides deliberate practice with LLM-generated feedback. In a randomized controlled experiment with 30 novices, ROPE significantly outperforms conventional prompt engineering training (20% vs. 1% gains), a gap that automatic prompt optimization cannot close. Furthermore, we demonstrate a direct correlation between the quality of input requirements and LLM outputs. Our work paves the way to empower more end-users to build complex LLM applications.
Free, publicly-accessible full text available April 24, 2026
Beyond Testers’ Biases: Guiding Model Testing with Knowledge Bases using LLMs

https://doi.org/10.18653/v1/2023.findings-emnlp.901

Yang, Chenyang; Rustogi, Rishabh; Brower-Sinning, Rachel; Lewis, Grace; Kaestner, Christian; Wu, Tongshuang (January 2023, Association for Computational Linguistics)

Full Text Available
Data Leakage in Notebooks: Static Detection and Better Processes

https://doi.org/10.1145/3551349.3556918

Yang, Chenyang; Brower-Sinning, Rachel A; Lewis, Grace; Kaestner, Christian (October 2022, ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering)

Data science pipelines to train and evaluate models with machine learning may contain bugs just like any other code. Leakage between training and test data can lead to overestimating the model’s accuracy during offline evaluations, possibly leading to deployment of low-quality models in production. Such leakage can happen easily by mistake or by following poor practices, but may be tedious and challenging to detect manually. We develop a static analysis approach to detect common forms of data leakage in data science code. Our evaluation shows that our analysis accurately detects data leakage and that such leakage is pervasive among over 100,000 analyzed public notebooks. We discuss how our static analysis approach can help both practitioners and educators, and how leakage prevention can be designed into the development process.
Full Text Available
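
The train/test leakage described in the abstract above can be illustrated with a minimal sketch. It shows one commonly cited form of leakage, computing preprocessing statistics over the full dataset before the split, and it is not the paper's static analysis. The data values and the `standardize` helper are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev

def standardize(values, mu, sigma):
    """Scale values with the given mean and standard deviation (hypothetical helper)."""
    return [(v - mu) / sigma for v in values]

# Toy data (made up): the last point is held out as the test set.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
train, test = data[:4], data[4:]

# Leaky: statistics are computed over ALL data, so the test point
# influences the transformation applied during training and evaluation.
leaky_mu, leaky_sigma = mean(data), pstdev(data)

# Leak-free: statistics come from the training split only.
clean_mu, clean_sigma = mean(train), pstdev(train)

leaky_test = standardize(test, leaky_mu, leaky_sigma)
clean_test = standardize(test, clean_mu, clean_sigma)

print("mean used (leaky vs. clean):", leaky_mu, clean_mu)
print("scaled test point (leaky vs. clean):", leaky_test[0], clean_test[0])
```

In the leaky variant the held-out point shifts the mean used for scaling, so the evaluation no longer reflects how the pipeline would treat genuinely unseen data.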
Subtle Bugs Everywhere: Generating Documentation for Data Wrangling Code

https://doi.org/10.1109/ASE51524.2021.9678520

Yang, Chenyang; Zhou, Shurui; Guo, Jin L.C.; Kästner, Christian (November 2021, Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), Los Alamitos, CA: IEEE Computer Society)

Data scientists reportedly spend 60 to 80 percent of their time in their daily routines on data wrangling, i.e., cleaning data and extracting features. However, data wrangling code is often repetitive and error-prone to write. Moreover, it is easy to introduce subtle bugs when reusing and adopting existing code, bugs that do not cause crashes but reduce model quality. To support data scientists with data wrangling, we present a technique to generate interactive documentation for data wrangling code. We use (1) program synthesis techniques to automatically summarize data transformations and (2) test case selection techniques to purposefully select representative examples from the data based on execution information collected with tailored dynamic program analysis. We demonstrate that a JupyterLab extension with our technique can provide documentation for many cells in popular notebooks, and we find in a user study that users with our plugin are faster and more effective at finding realistic bugs in data wrangling code.
Full Text Available
